Identifying the Context Shift between Test Benchmarks and Production Data
Machine learning models are often brittle on production data despite
achieving high accuracy on benchmark datasets. Benchmark datasets have
traditionally served dual purposes: first, benchmarks offer a standard on which
machine learning researchers can compare different methods, and second,
benchmarks provide a model, albeit imperfect, of the real world. The
incompleteness of test benchmarks (and of the data upon which models are
trained) hinders robustness in machine learning, enables shortcut learning, and
leaves
models systematically prone to err on out-of-distribution and adversarially
perturbed data. The mismatch between a single static benchmark dataset and a
production dataset has traditionally been described as a dataset shift. In an
effort to clarify how to address the mismatch between test benchmarks and
production data, we introduce context shift to describe semantically meaningful
changes in the underlying data generation process. Moreover, we identify three
methods for addressing context shift that would otherwise lead to model
prediction errors: first, we describe how human intuition and expert knowledge
can identify semantically meaningful features upon which models systematically
fail; second, we detail how dynamic benchmarking - with its focus on capturing
the data generation process - can promote generalizability through
corroboration; and third, we highlight that clarifying a model's limitations
can reduce unexpected errors. Robust machine learning is focused on model
performance beyond benchmarks, and as such, we consider three model organism
domains - facial expression recognition, deepfake detection, and medical
diagnosis - to highlight how implicit assumptions in benchmark tasks lead to
errors in practice. By paying close attention to the role of context,
researchers can design more comprehensive benchmarks, reduce context shift
errors, and increase generalizability.
The antiobesity factor WDTC1 suppresses adipogenesis via the CRL4WDTC1 E3 ligase
WDTC1/Adp encodes an evolutionarily conserved suppressor of lipid accumulation. While reduced WDTC1 expression is associated with obesity in mice and humans, its cellular function is unknown. Here, we demonstrate that WDTC1 is a component of a DDB1-CUL4-ROC1 (CRL4) E3 ligase. Using the 3T3-L1 cell culture model of adipogenesis, we show that disrupting the interaction between WDTC1 and DDB1 leads to a loss of adipogenic suppression by WDTC1, increased triglyceride accumulation and adipogenic gene expression. We show that the CRL4WDTC1 complex promotes histone H2AK119 monoubiquitylation, thus suggesting a role for this complex in transcriptional repression during adipogenesis. Our results identify a biochemical role for WDTC1 and extend the functional range of the CRL4 complex to the suppression of fat accumulation.
A massive nebula around the Luminous Blue Variable star RMC143 revealed by ALMA
The luminous blue variable (LBV) RMC143 is located in the outskirts of the
30 Doradus complex, a region rich with interstellar material and hot luminous
stars. We report the sub-millimetre detection of its circumstellar
nebula with ALMA. The observed sub-millimetre morphology differs from that
previously observed with HST and ATCA in the optical and centimetre
wavelength regimes. The spectral energy distribution (SED) of RMC143 suggests
that two emission mechanisms contribute to the sub-mm emission: optically thin
bremsstrahlung and dust. Both the extinction map and the SED are consistent
with a dusty massive nebula. To date, RMC143 hosts the dustiest LBV nebula
observed in the Magellanic Clouds. We have also re-examined
the LBV classification of RMC143 based on VLT/X-shooter spectra obtained in
2015/16 and a review of the publication record. The radiative transfer code
CMFGEN is used to derive its fundamental stellar parameters, including its
effective temperature, luminosity, and a relatively high mass-loss rate. The
luminosity is much lower than previously thought, which implies that the
current stellar mass is comparable to its nebular mass (from an assumed
gas-to-dust ratio of 100), suggesting that the star has lost a large fraction
of its initial mass in past LBV eruptions or binary interactions. While the
star may have been hotter in the past, it is currently not hot enough to ionize
its circumstellar nebula. We propose that the nebula is ionized externally by
the hot stars in the 30 Doradus star-forming region.
Comment: Paper accepted by A&A on 09/05/2019 and now in the proof stage. The
referee's second comments are included in this version.
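For context, the two-mechanism SED decomposition mentioned above is commonly written as the sum of an optically thin free-free power law and a modified-blackbody dust term. The following LaTeX form is a generic illustrative sketch, not the specific fit from the paper, and all symbols are placeholders:

% Generic two-component sub-mm SED model (illustrative only):
% optically thin bremsstrahlung plus modified-blackbody dust emission.
S_\nu \simeq S_{\rm ff} \left( \frac{\nu}{\nu_0} \right)^{-0.1}
       + \frac{\kappa_0 \left( \nu / \nu_0 \right)^{\beta} \, M_{\rm d} \, B_\nu(T_{\rm d})}{d^2}

Here S_ff normalises the bremsstrahlung component at a reference frequency nu_0, kappa_0 and beta set the dust opacity law, M_d and T_d are the dust mass and temperature, B_nu is the Planck function, and d is the distance; an assumed gas-to-dust ratio then converts M_d into a nebular gas mass, as done in the abstract above.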
Improving dermatology classifiers across populations using images generated by large diffusion models
Dermatological classification algorithms developed without sufficiently
diverse training data may generalize poorly across populations. While
intentional data collection and annotation offer the best means for improving
representation, new computational approaches for generating training data may
also aid in mitigating the effects of sampling bias. In this paper, we show
that DALLE 2, a large-scale text-to-image diffusion model, can produce
photorealistic images of skin disease across skin types. Using the Fitzpatrick
17k dataset as a benchmark, we demonstrate that augmenting training data with
DALLE 2-generated synthetic images improves classification of skin
disease overall, and especially for underrepresented groups.
Comment: NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research
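As a hedged illustration of this augmentation strategy (a minimal sketch only: the directory layout, transforms, and class organisation are assumptions, and generating the synthetic images themselves is out of scope here), real and diffusion-generated images can simply be pooled into one training set:

# Hedged sketch: pool real and synthetic (diffusion-generated) images into
# one training set. Paths and folder layout are hypothetical; both roots are
# assumed to contain identical <class_name>/ subfolders so labels align.
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import transforms
from torchvision.datasets import ImageFolder

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

real = ImageFolder("data/fitzpatrick17k_train", transform=tfm)
synthetic = ImageFolder("data/dalle2_synthetic", transform=tfm)

# How many synthetic images to generate per class and skin type is the
# key design choice for improving underrepresented groups.
train_set = ConcatDataset([real, synthetic])
loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)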
Mathematical and computational models of drug transport in tumours
The ability to predict how far a drug will penetrate into the tumour microenvironment within its pharmacokinetic (PK) lifespan would provide valuable information about therapeutic response. As the PK profile is directly related to the route and schedule of drug administration, an in silico tool that can predict the drug administration schedule that results in optimal drug delivery to tumours would streamline clinical trial design. This paper investigates the application of mathematical and computational modelling techniques to help improve our understanding of the fundamental mechanisms underlying drug delivery, and compares the performance of a simple model with more complex approaches. Three models of drug transport are developed, all based on the same drug binding model and parametrized by bespoke in vitro experiments. Their predictions, compared for a 'tumour cord' geometry, are qualitatively and quantitatively similar. We assess the effect of varying the PK profile of the supplied drug, and the binding affinity of the drug to tumour cells, on the concentration of drug reaching cells and the accumulated exposure of cells to drug at arbitrary distances from a supplying blood vessel. This is a contribution towards developing a useful drug transport modelling tool for informing strategies for the treatment of tumour cells which are 'pharmacokinetically resistant' to chemotherapeutic strategies.
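For readers new to this model class, a generic reaction-diffusion formulation (an illustrative LaTeX sketch, not the authors' specific equations; all symbols are placeholders) couples extracellular drug diffusion to reversible binding on tumour cells:

% Generic diffusion-binding model (illustrative only):
% free drug C diffuses and binds reversibly to cell sites, giving bound drug C_b.
\frac{\partial C}{\partial t} = D \nabla^2 C
    - k_{\rm on} C (B_{\max} - C_b) + k_{\rm off} C_b,
\qquad
\frac{\partial C_b}{\partial t} = k_{\rm on} C (B_{\max} - C_b) - k_{\rm off} C_b

Here C is the free drug concentration, C_b the cell-bound concentration, D the diffusivity, k_on and k_off the binding and unbinding rates, and B_max the binding-site density; the PK profile enters as a time-dependent boundary condition C = C_v(t) at the supplying blood vessel wall, which is how the administration schedule influences tissue penetration.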
SelfClean: A Self-Supervised Data Cleaning Strategy
Most benchmark datasets for computer vision contain irrelevant images, near
duplicates, and label errors. Consequently, model performance on these
benchmarks may not be an accurate estimate of generalization capabilities. This
is a particularly acute concern in computer vision for medicine where datasets
are typically small, stakes are high, and annotation processes are expensive
and error-prone. In this paper we propose SelfClean, a general procedure to
clean up image datasets exploiting a latent space learned with
self-supervision. By relying on self-supervised learning, our approach focuses
on intrinsic properties of the data and avoids annotation biases. We formulate
dataset cleaning as either a set of ranking problems, which significantly
reduce human annotation effort, or a set of scoring problems, which enable
fully automated decisions based on score distributions. We demonstrate that
SelfClean achieves state-of-the-art performance in detecting irrelevant images,
near duplicates, and label errors within popular computer vision benchmarks,
retrieving both injected synthetic noise and natural contamination. In
addition, we apply our method to multiple image datasets and confirm an
improvement in evaluation reliability.
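To make the ranking formulation concrete, here is a hedged sketch (not the SelfClean implementation; the encoder, metric, and neighbourhood size are placeholder choices) of how self-supervised embeddings can rank candidate near duplicates and irrelevant samples for human confirmation:

# Hedged sketch of dataset cleaning as ranking problems over a
# self-supervised latent space. `embeddings` is an (n_images, dim) array
# from any self-supervised encoder; nothing here is specific to the
# SelfClean paper's actual procedure. Both rankings are O(n^2) in memory,
# which is fine for benchmark-sized datasets.
import numpy as np

def near_duplicate_ranking(embeddings: np.ndarray):
    """Rank image pairs by cosine similarity, most similar first."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = emb @ emb.T
    i, j = np.triu_indices(len(emb), k=1)   # unique unordered pairs
    order = np.argsort(-sim[i, j])          # descending similarity
    return list(zip(i[order], j[order], sim[i, j][order]))

def irrelevance_ranking(embeddings: np.ndarray, k: int = 10):
    """Rank samples by mean cosine distance to their k nearest
    neighbours; isolated samples are candidate irrelevant images."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    dist = 1.0 - emb @ emb.T
    np.fill_diagonal(dist, np.inf)           # exclude self-distance
    knn = np.sort(dist, axis=1)[:, :k]       # k smallest distances per row
    return np.argsort(-knn.mean(axis=1))     # most isolated first

An annotator then confirms items from the top of each ranking until few true issues remain, which is far cheaper than exhaustively re-reviewing the whole dataset; thresholding the score distributions instead yields the fully automated variant.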
Towards Reliable Dermatology Evaluation Benchmarks
Benchmark datasets for digital dermatology unwittingly contain inaccuracies
that reduce trust in model performance estimates. We propose a
resource-efficient data cleaning protocol to identify issues that escaped
previous curation. The protocol leverages an existing algorithmic cleaning
strategy and is followed by a confirmation process terminated by an intuitive
stopping criterion. Based on confirmation by multiple dermatologists, we remove
irrelevant samples and near duplicates and estimate the percentage of label
errors in six dermatology image datasets for model evaluation promoted by the
International Skin Imaging Collaboration. Along with this paper, we publish
revised file lists for each dataset which should be used for model evaluation.
Our work paves the way for more trustworthy performance assessment in digital
dermatology.
Comment: Link to the revised file lists:
https://github.com/Digital-Dermatology/SelfClean-Revised-Benchmark
Art and the science of generative AI: A deeper dive
A new class of tools, colloquially called generative AI, can produce
high-quality artistic media for visual arts, concept art, music, fiction,
literature, video, and animation. The generative capabilities of these tools
are likely to fundamentally alter the creative processes by which creators
formulate ideas and put them into production. As creativity is reimagined, so
too may be many sectors of society. Understanding the impact of generative AI -
and making policy decisions around it - requires new interdisciplinary
scientific inquiry into culture, economics, law, algorithms, and the
interaction of technology and creativity. We argue that generative AI is not
the harbinger of art's demise, but rather is a new medium with its own distinct
affordances. In this vein, we consider the impacts of this new medium on
creators across four themes: aesthetics and culture, legal questions of
ownership and credit, the future of creative work, and impacts on the
contemporary media ecosystem. Across these themes, we highlight key research
questions and directions to inform policy and beneficial uses of the
technology.
Comment: This white paper is an expanded version of Epstein et al. (2023),
published in Science Perspectives on July 16, 2023, which you can find at the
following DOI: 10.1126/science.adh445